Abstract


The purpose of this study was to determine the effects of 
different probability schedules of reinforcement over a series 
of successive reversals in a two-choice learning situation. 
Five equal probability schedules of reinforcement were employed: 
100:00, 90:10, 80:20, 70:30, and 60:40. The main dependent 
variable was the number of responses to the more frequently 
reinforced bar. The experimental Ss were rats of the
Sprague-Dawley strain.


The results showed that different probability schedules of 
reinforcement in successive probability reversals had differential 
effects on the terminal response levels. In regard to mean response 
level the groups divided their responses into proportions which 
tended to equal or match the probability of the scheduled events. 
Individual Ss did not reflect the mean terminal response levels of
their respective groups, i.e., they did not match their respective
probability schedules of reinforcement.


The general shapes of the curves were not congruent with 
statistical learning theory. They were discrepant in that the 
initial response probabilities were extremely divergent from 
their predicted ones, and in that they did not reflect the usual 
negatively accelerated growth characteristics. An error analysis
of the response patterns suggested that a pre-experimentally
induced strategy, set, or mode of responding may have been operating
throughout the entire reversal series. The findings of the
error analysis, along with the aforementioned findings, were
discussed in regard to the inadequacy of statistical learning
theory to explain the results of this study.

Introduction


The purpose of this study was to delineate the effects 
of different probability schedules of reinforcement over a series 
of successive reversals in a two-choice learning situation. 
The present study was prompted by some of the theoretical 
implications of Estes' statistical learning theory for behavior 
in two-choice learning situations. More specifically, it is
concerned with the extension of statistical learning theory into
a more complex situation. The empirical basis for this study is
derived from two related areas: probability learning and reversal
discrimination. The relevant background information from these
areas, based upon research with humans and animals, is discussed
separately and integrated later.


Probability Learning 


In statistical learning theory, behavior is seen as an 
essentially probabilistic phenomenon. The primary behavioral 
measure is taken to be the probability of occurrence of a member 
of some response class. The types of apparatus used in human and 
animal experiments have usually differed from experiment to 
experiment but have been designed to permit the S to make one of 
two possible choices or predictions on any given trial. Depending 
upon the procedure being used, the S, after making his choice, may 
or may not receive feedback as to whether or not his choice was 
correct or incorrect on that particular trial. In the "noncontingent"
procedure, the information given to the S is independent of the
choice or prediction he makes. In other words, regardless of
what choice or prediction was made on a particular trial, the S
is informed what choice or prediction was correct on that trial.
In the "contingent" procedure, the information received by the S
is dependent upon the choice he makes on any given trial. If he
predicts the event that was programmed as the correct event, he
is informed that he has predicted correctly; but if he predicts
the other event, he is simply informed that his prediction was
wrong. The events to be predicted are usually designated as E1
and E2, with E1 as the higher probability event and E2 as the
lower probability event. The two possible choices that can be
made to E1 and E2 are designated as A1 and A2, respectively.


Since 1950 there has been a strong interest and intensive 
development by various researchers in the area of "probabilistic" 
learning. The first research on this problem was a study by 
Brunswick (1939) using the white rat as an experimental subject. 
Brunswick, employing a standard one unit elevated T-maze, studied 
the rat's behavior in a two-choice left-right situation under three 
conditions of equal probability schedules, 100:00, 75:25, and 67:33,
in which the animals were allowed to correct for errors. He found
that the 100:00 and 75:25 groups' final asymptotes tended to equal
the probability of being rewarded for the left and right choices, but
found no evidence of learning in the 67:33 group. Humphreys (1939),
using humans as Ss, asked them to predict whether a light would
or would not appear after a signal light was turned on. This
experimental design was meant to be analogous to his research on
conditioning of eyelid responses to a light followed by a puff of
air, in that the signal light represented the conditioned stimulus
and the light to be predicted represented the unconditioned stimulus.
Thus, the Ss' guesses (anticipations of the second light) were defined
as conditioned responses. When he tested the Ss under two reward
series, 100:00 and 50:50, Humphreys found that Ss divided their
predictions into proportions which tended to equal or match the
proportion or probability of the scheduled sequence of events, i.e.,
100:00 and 50:50.


Interest in probability learning has increased greatly since 
the original studies of Brunswick (1939) and Humphreys (1939), 
resulting in many empirical investigations, and in the emergence 
of various formal mathematical models designed to explain the 
results of these studies. Estes (1950) added impetus by trying 
to formalize some of the basic ideas of learning theory. Bush 
and Mosteller (1951) started to investigate the use of linear
operators in analyzing learning data, and since then various other
models for behavior (mostly with humans) in two-choice and multiple
choice probability learning situations have appeared (Anderson
and Hovland, 1957; Davidson, Suppes and Siegel, 1957; Edwards, 1954;
Siegel and Goldstein, 1959). This study is primarily concerned
with Estes' model.


In Estes' model, the stimulus population is the central
concept. The stimulus population is supposed to consist of a
set of elements from which the S samples stimuli on each trial,
and in turn these stimuli are connected in an "all or none"
fashion to the response just made. Each element is conditioned
to one and only one response. This theoretical formulation of
Estes has come to be known as "stimulus sampling theory". The
basic notion of stimulus sampling theory is the conceptualization
of the totality of stimulus conditions that may be effective during
the course of learning. Any element in the stimulus population
has equal probability of being sampled by the S on any given trial
and these trial samples are drawn randomly from the population,
with all samples of a given size having equal probabilities.


Estes' model assumes that only some of the stimuli are
sampled by the S on any given trial, and in turn these
sampled stimuli are connected or conditioned to responses made
to E1 or E2 events. Thus the probability of a response occurring
on any given trial is equal to the number of sampled stimulus
elements that have been conditioned to a particular response,
divided by the total number of elements sampled. For example,
if the experimental design were a two-choice situation in which
E1 = .75 and E2 = .25, and the Ss were run for a thousand trials,
Estes' model would predict that the final asymptotic response level
would tend towards .75 and .25 for A1 and A2, respectively. This
type of behavior in the two-choice learning situation has been
labelled "event or probability matching"; that is to say, the
subjects will make 75% of their responses to one choice and 25%
to the other one.


Learning in Estes' model can be depicted as a process of
random sampling of the stimulus components which become connected
to responses in accordance with the probabilities of E1 and E2
events. The rate of learning, θ, for any one S is equal to the
proportion of stimuli sampled on each trial. From this it follows
that the amount of learning, or rather the increase in the probability
of a response per trial, is a constant fraction of the amount
remaining to be learned. Estes (1954) maintains that, for any one
S, θ is a constant which does not change throughout an experiment.
Due to the abstract nature of θ it can only be determined post hoc.
Once θ is determined from one experiment, however, it can be used
a priori in similar stimulus situations. The prediction of an S's
performance in regard to both the shape of the learning curve and
the final asymptotic level of responding is determinable from θ
for any probability reinforcement schedule.
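
The statement that each trial's gain is a constant fraction of the amount remaining to be learned can be illustrated numerically. The following is a minimal sketch, assuming the linear form usually associated with the model for a noncontingent schedule, in which the probability of an A1 response moves a fraction theta of the remaining distance toward pi, the probability of the E1 event, on each trial. The function names and the values theta = .05 and p0 = .50 are illustrative assumptions and are not estimates from this study.

```python
def next_probability(p, theta, pi):
    """One trial of the assumed linear model: move a fraction theta of the way to pi."""
    return p + theta * (pi - p)


def learning_curve(p0, theta, pi, n_trials):
    """Expected probability of an A1 response on trials 0..n_trials."""
    curve = [p0]
    for _ in range(n_trials):
        curve.append(next_probability(curve[-1], theta, pi))
    return curve


def closed_form(p0, theta, pi, n):
    """Closed-form solution of the recursion: pi - (pi - p0) * (1 - theta) ** n."""
    return pi - (pi - p0) * (1 - theta) ** n


if __name__ == "__main__":
    # Hypothetical values: theta = .05 and p0 = .50 are illustrative only.
    curve = learning_curve(p0=0.50, theta=0.05, pi=0.75, n_trials=1000)
    print(round(curve[-1], 3))  # approaches the matching value of .75
```

Under these assumptions the curve is negatively accelerated and its asymptote equals pi whatever the value of theta, which is the matching prediction; theta affects only how quickly the asymptote is approached.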


The empirical findings of probability learning investigations
in two-choice situations have both supported and conflicted with
Estes' probability matching phenomena.


A number of experiments (Detambel, 1955; Estes and Burke, 
1955; Estes, Burke, Atkinson and Frankman, 1957; Estes and Straughan,
1954; Gardner, 1957; Grant, Hake, and Hornseth, 1951; Humphreys,
1939; Jarvik, 1951; Morse and Runquist, 1960; and Neimark and
Shuford, 1959) using humans in two-choice noncontingent probability
situations have found that Ss tend towards an asymptotic response
level equal to the probability of reinforcement designated by the
experimental design.

However, other studies (Bush and Mosteller, 1955; Detambel,
1955; Edwards, 1956; Goodnow, 1951; Neimark, 1956; and Siegel,
1959) involving a contingent procedure have demonstrated conflicting
results. They found that Ss tended to "maximize" the choice
associated with E1 events.


The above studies suggest that the conflicting results
(matching vs. maximizing) seem to be a function of contingent
vs. noncontingent procedures. This conclusion is far from
certain, as Gardner (1957) found no significant differences
between contingent and noncontingent conditions.


Another approach to decision-making situations was formalized 
by von Neumann and Morgenstern in 1947. Their theoretical game 
model predicts that a person will learn to maximize the expected 
frequency of correct predictions. Game theorists view probability- 
matching as an "irrational" strategy and believe that Ss should 
eventually settle on the "rational" strategy of choosing the more
frequently reinforced event 100% of the time.
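
The contrast between matching and maximizing can be made concrete by computing the expected proportion of correct predictions under each strategy. The sketch below assumes that events occur independently from trial to trial, with probability pi for the more frequent event, and that a matching S chooses independently of the event sequence; the function names are hypothetical.

```python
def expected_correct_matching(pi):
    """Expected proportion correct when A1 is chosen with probability pi."""
    return pi * pi + (1 - pi) * (1 - pi)


def expected_correct_maximizing(pi):
    """Expected proportion correct when the more frequent event is always chosen."""
    return max(pi, 1 - pi)


if __name__ == "__main__":
    for pi in (0.60, 0.70, 0.75, 0.80, 0.90, 1.00):
        matching = expected_correct_matching(pi)
        maximizing = expected_correct_maximizing(pi)
        print(f"pi={pi:.2f}  matching={matching:.3f}  maximizing={maximizing:.3f}")
```

With pi = .75, for example, matching yields .75 x .75 + .25 x .25 = .625 expected correct predictions, whereas always choosing the more frequent event yields .75, which is why the game-theoretic view treats matching as the less efficient strategy.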


From decision-making theory, Davidson, Suppes, and Siegel 
(1957) and Edwards (1954) have presented a hypothesis of
"maximization of expected utility" which seems to account for both types
of predictions, probability-matching and maximization of the more
frequent event.


Edwards (1956), using human Ss in a two-choice learning situation,
demonstrated that the asymptotic probability levels were higher than the
probability of reinforcement when both the probabilities and the
amount of reward were varied. He concluded that both probability
and amount of reward are important in determining choice behavior
and that it is possible to compensate for changes in probability
of reward by reverse changes in the amount of reward.


Siegel and Goldstein (1959) tested Ss in a two-choice situation 
where they systematically varied the utility of the correct event 
as follows: (a) a "no pay off" condition in which the only reward
involved was feedback in regard to whether or not the subject's
prediction was correct or incorrect; (b) a "reward" condition in
which a monetary profit could be attained by a correct prediction
of the events; and (c) a "risk" condition in which a monetary gain
was involved with a correct prediction but also a monetary "loss"
with an incorrect prediction. Their results showed that the
probability of predicting the more frequent event would tend
towards unity as the rewards (positive utility) and costs or risks
(negative utility) of correct and incorrect predictions were
increased. Thus the utility of the situation is defined as the
"subjective value" the individual places on the outcome of the
task, and this has resulted in the general hypothesis that the S will
maximize expected utility in any case.


Conflicting results have also been found where animals were
used as experimental Ss. Brunswick (1939), using the rat in an
elevated T-maze apparatus with a noncontingent procedure, i.e.,
the rats were allowed to correct for an error, found that the Ss
tended to match probabilities under various reinforcement schedules.
In the light of present-day experiments, which usually run the Ss
for 500-1000 or more trials, Brunswick's data are probably somewhat
inadequate as they represent only five days of training. Lauer and
Estes (1954) trained rats in a T-maze using a noncontingent procedure
with a reinforcement schedule of .75 and .25 randomized with respect
to both trials and daily blocks. They found that the asymptotic
response levels were distributed around mean values of .75 and
.25. Uhl (1963), using both contingent, i.e., rats were not
allowed to correct after an error, and noncontingent procedures
in a two-choice bar pressing situation, trained rats for 1,000
trials with probabilities of .90, .80, .70, and .60 under four
reinforcement conditions of sucrose concentration: 6, 12, 24, and
48 per cent. Rats under the contingent design were more efficient,
i.e., the Ss' final asymptotic response levels to E1 were higher
than those in the noncontingent. In both cases the Ss tended to
over-match their respective E1 probabilities, with the noncontingent
group the closest to demonstrating probability matching behavior.
When the four conditions of sucrose reinforcement were compared
over the last six days, no significant differences were found
between the asymptotic levels of the four probability groups.


Uhl's study contradicts the utility models of both Edwards 
(1956) and Siegel and Goldstein (1959) which were previously 
considered as explanations for the conflict between probability 
matching and maximization. Uhl concluded that approximately 1,000
trials are necessary to evaluate the effect of different probabilities
on behavior.

Parducci and Polt (1958) trained rats on a single unit
elevated T-maze under both contingent and noncontingent procedures
with E1 = .85. They found that the contingent group maximized to
the more favorable side whereas the noncontingent group manifested
probability matching behavior.


Stanley (1950) using equal probabilities of 100:00, 50:50, 
and 75:25 in a contingent situation found that rats in a T-maze 
tended to approach an asymptotic response level of 1.00 (.917, 
.958, and .917 respectively). The low value of .917 for the 1.00 
group was a product of the experimental design used by Stanley in 
which he controlled the number of rewards instead of the total 
number of trials. He also used a matched-litter technique and, 
as a result, the fast learners in the 1.00 group reached the set 
criterion within a few days and were dropped from the experiment, 
leaving the slow learners as strong contributors to the final 
asymptotic level. Bitterman, Wodinsky & Candland (1958) also found
that rats maximized the higher probability choice when run under
a contingent procedure.


It is clear from the aforementioned studies on probability 
learning that Estes' prediction of probability matching has both 
confirmatory and contradictory evidence. The common element 
to all these studies is the simplicity of the learning task 
employed, while their differences are concerned with parameters of
reinforcement and acquisition procedures.

Reversal Learning


Most of the studies on repeated reversal learning have 
indicated that Ss manifest a systematic decrease in errors from 
the first to the last reversal. All of the first studies 
(Krechevsky, 1938; Buytendijk, 1930; Williams, 1942) demonstrated
this general trend, but differences were noted in the learning
rate and final outcome of the reversal performance. Many
inconsistencies that are evident in these earlier studies can
probably be explained by the use of different apparatus and
procedures, and by the different theoretical interests of the
writers.


Current research has tended to focus on factors that influence 
reversal learning, such as massed vs. spaced trials (North, 1950a 
and 1950b), number of trials per reversal (North, 1950a and 1950b),
correction vs. noncorrection (North, 1950a and 1950b), trial vs.
performance criteria (Dufort, Guttman, and Kimble, 1954; North,
1950a and 1950b; and Stretch, 1963) and effect of overlearning
(Brookshire, Warren and Ball, 1961; Capaldi and Stevenson, 1957;
Mackintosh, 1962; North, 1950a; North and Clayton, 1959; Pubols,
1956; and Reid, 1953). Although there is some ambiguity as to the
theoretical explanations regarding the effects of these factors on
the final outcome of reversal performance, the general trend of
decreasing errors over a series of reversals is a consistent
finding.

The Problem


In the studies achieving probability matching, the task in
both humans and lower animals has been a relatively simple
two-choice learning situation, as a left or right response is
consistently associated with E1 events throughout the experiment.


The reversal learning problem was chosen because of its
similarity to two-choice probability situations and because it
is regarded as a more "complex" problem for the rat than a simple
left-right choice discrimination problem (Koronakos & Arnold,
1957).


The reversal problem provides "complexity" in that the E1
events are not consistently associated with either a left or
right choice throughout the entire experiment. On day one of
the experiment E1 events, the higher of the two probabilities,
are contingent upon "left" choices, but on day two they are
contingent upon "right" choices. It has been demonstrated by
various investigators that the S's probability of predicting a
given event, over a series of trials in which two alternative
reinforcing events occur with fixed probabilities, tends to approach
(match) the actual probability of the event. The experiment was
designed to determine whether "probability matching" as predicted
by Estes in a simple two-choice learning situation would occur in
a more complex two-choice learning situation.

Method


Design 


Thirty rats were given reversal training in a two-bar modified 
Skinner Box until they reached a set criterion of reversal performance. 
Following pretraining they were randomly divided into five
probability groups which differed in terms of the probability of
reinforcement (.60, .70, .80, .90, and 1.00) for the correct bar on
all later probability-reversal training. Probability of reinforcement
(P) refers to the more frequently reinforced response. The probability
of the less frequently reinforced response was 1-P.


Four sequences of 40 response alternatives were prepared 
for each of the five probability values. Each sequence was 
determined randomly within the restriction that the appropriate 
probability be maintained exactly over blocks of 20 trials. 
A random permutation of the four sequences was drawn for each S
for successive four day periods. Probability-reversal training
consisted of 40 trials per day for 30 consecutive days. A
noncorrection procedure was used in which both bars were exposed on
every trial. The bars were retracted for 5 seconds immediately
upon depression of either bar whether a reinforcement was received
or not.
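
The construction of the reinforcement sequences described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used to prepare the sequences: the function names, the use of a pseudo-random shuffle, and the representation of each trial as a boolean (True meaning that the more frequently reinforced bar is the reinforced alternative on that trial) are all hypothetical.

```python
import random


def make_sequence(p, n_trials=40, block_size=20, rng=None):
    """Return n_trials booleans in which the proportion of True values
    is exactly p within every block of block_size trials."""
    rng = rng or random.Random()
    n_true = round(p * block_size)
    if abs(n_true - p * block_size) > 1e-9:
        raise ValueError("p cannot be held exactly over this block size")
    sequence = []
    for _ in range(n_trials // block_size):
        block = [True] * n_true + [False] * (block_size - n_true)
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence


def make_schedule(p, n_sequences=4, rng=None):
    """Prepare n_sequences sequences and return them in a random order,
    one sequence per day of a four day period."""
    rng = rng or random.Random()
    sequences = [make_sequence(p, rng=rng) for _ in range(n_sequences)]
    rng.shuffle(sequences)
    return sequences


if __name__ == "__main__":
    schedule = make_schedule(0.70, rng=random.Random(1))
    print([sum(day[:20]) for day in schedule])  # 14 in each first block
```

With blocks of 20 trials each of the five probability values corresponds to a whole number of trials per block (12, 14, 16, 18, and 20), so the restriction can be met exactly.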


Subjects 


The experimental subjects were 30 male albino rats of the 
Sprague-Dawley strain. Their weight at the beginning of the training
ranged between 180 and 200 grams (60 to 80 days of age).

Apparatus


The basic equipment consisted of a modified Skinner Box 
equipped with a means of delivering liquid reinforcement (Lehigh 
Valley dipper). Two retracting bars (Lehigh Valley) were located 
three inches on either side of the dipper and three inches from 
the floor of the box. Approximately 10 grams force was required 
to depress a lever fully. The dipper dispensed .02 ml. of sucrose 
solution. The test apparatus was made of wood and had overall 
dimensions of 12 X 12 X 12 inches. The inside of the apparatus 
was painted a flat grey and was illuminated by two five-watt 
bulbs centrally located at both ends and two inches from the top
of the box. The modified Skinner Box was placed in an insulated
test chamber. The Ss could be observed through one-way glass
located in the lid of the insulated test chamber.


All E-controlled events, i.e., the sequence of reinforced 
response alternatives and intertrial interval, were operated by 
an automatic programming device. All responses and intertrial 
latencies were automatically recorded by a 4-pen ink recorder and
Hunter Klock-counter, respectively.

Procedure


Animals were housed for 10 days in individual cages on an 
ad libitum diet. The animals were maintained on 22-hour food 
deprivation after the tenth day. Each animal was then gentled
for 10 minutes a day for the next seven days. On the eighteenth
day the animals were placed in the experimental box, with both
bars retracted, for 15 minutes for acclimatization purposes, and
were then returned to their home cages and fed 20 minutes later.
On this day the water bottle was removed from their cage and was
replaced with another bottle containing 20 ozs. of 36% sucrose
solution. The sucrose solution was removed after it had been
consumed and the water bottles were replaced. This pre-experience
with the sucrose solution was found to be necessary to facilitate
pretraining as the animals did not seem to immediately prefer the
sucrose solution upon first tasting it.


Magazine Training 


The nineteenth and twentieth days consisted of magazine 
training in which the dipper click (noise made by the dipper 
when operating) was introduced and consequently followed by 
reinforcement. Twenty reinforcements were given on the first 
day. After twenty reinforcements the "click" then operated as
a discriminative stimulus which allowed the experimenter to shape
alternation behavior, from one side of the box to the other, for
twenty reinforced alternations. The bars were in the retracted
position during all magazine training.


Bar Training 


Day 1: Bar training consisted of "one-bar" training in which 
the bars were randomly presented one at a time for twenty reinforced
responses.

Day 2: The Ss were given 10 consecutive reinforcements
with only one bar protruding. After the tenth bar press,
reinforcement on this bar was discontinued and the other bar
was released. The former bar was retracted if the S shifted, or
at the end of four more non-reinforced bar presses. The Ss
then received 10 consecutive reinforcements on the protruding bar
and were reversed twice more in the same way, thus receiving
forty reinforced trials in one session.

Day 3: The same as Day 2 except that the first 10 trials
started with the opposite bar to that used in Day 2.


Day 4: Consisted of 20 reinforced trials to each bar with
one bar protruding on the first ten trials and with both bars
protruding on the last ten trials. Reinforcement was reversed to
the other bar on the 21st trial.

Day 5: The same as Day 4 except that the initial trials
started on the opposite bar.

Day 6: Consisted of both bars protruding with only "one
bar" being reinforced for 40 trials. Both bars retracted after
a bar press and were released five seconds later.

During Days 1 to 5 the apparatus was manually operated.
From Day 6 on the apparatus was automatically controlled by a
programmed tape unit.

Reversal Training


The Ss were run 40 trials per day receiving reinforcement
on one bar only. The following day the reinforcement was reversed
to the other bar for 40 trials. Reversal training continued
until the S reached a criterion of no more than five errors per
day with the last 9 out of 10 responses correct for two consecutive
days. After the criterion was attained, the Ss were assigned
at random to one of the five experimental groups.
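
The reversal criterion can be expressed as a short check on each day's responses. The sketch below rests on one reading of the criterion, namely that a day qualifies when it contains no more than five errors and at least 9 of its final 10 responses are correct, and that two such days must occur in succession; this reading, the function names, and the boolean coding of responses (True = correct bar) are assumptions.

```python
def day_meets_criterion(responses, max_errors=5, window=10, min_correct=9):
    """responses: one day's responses as booleans, True = correct bar."""
    errors = responses.count(False)
    last_correct = sum(responses[-window:])
    return errors <= max_errors and last_correct >= min_correct


def days_to_criterion(days, consecutive=2):
    """Return the 1-based day on which the criterion is reached, or None."""
    run = 0
    for index, responses in enumerate(days, start=1):
        run = run + 1 if day_meets_criterion(responses) else 0
        if run >= consecutive:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical response records of 40 trials each.
    good = [False] * 3 + [True] * 37
    bad = [False] * 8 + [True] * 32
    print(days_to_criterion([bad, good, bad, good, good]))
```

Resetting the run counter after any day that fails means that two qualifying days separated by a failing day do not satisfy the criterion.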


Probability-Reversal Training 


The S was placed in the test apparatus in which both bars 
were in the retracted position and after the compartment door 
was closed the bars were released. A noncorrection procedure was 
used in the present experiment, i.e., both bars would retract 
after the S pressed either one of the bars. The bars remained 
retracted for 5 seconds. A trial consisted of the time from when
the S depressed one of the bars (which resulted in
reinforcement or non-reinforcement) until the release of both
bars, which initiated the next trial. The session
was terminated after 40 trials and the S was returned to its home
cage for feeding 20 minutes later. Each S was run through each
training stage in a consecutive sequence from Day 1 of gentling
to the final day of probability-reversal training.

Results


Reversal Performance 


The unit of measurement was the number of A1 responses for
each successive reversal. The response asymptotes at the end of
reversal pretraining for the five probability groups were .90,
.91, .90, .9, and .W.

The mean probability of A1 responses over all trials for
reversals 1 through 30 in blocks of two reversals is plotted
in Fig. 1. To test the significance of a difference between
the obtained curves and their predicted matching asymptotes for
each treatment group, means for each individual S were calculated
over the last 8 reversals and compared with the predicted mean
asymptotic value (matching) for each respective group. Significant
t's were found in the 1.00 and .60 groups (p < .05), while the
.90, .80, and .70 groups were not significantly different from
their predicted mean asymptotic values (see Table 1). In general
there seemed to be a tendency for the groups to under-match or
approximately match their respective probability schedules of
reinforcement.
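
The comparison of individual means with a predicted asymptote can be sketched as a one-sample t test. This is offered only as an illustration of the kind of computation involved, on the assumption that a conventional one-sample t was used; the function name is hypothetical and the six values are invented for the example and are not data from Table 1.

```python
import math
import statistics


def one_sample_t(values, predicted):
    """Return (t, df) for the difference between the mean of values and predicted."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t = (mean - predicted) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical proportions of A1 responses for six Ss; not data from Table 1.
    group_60 = [0.55, 0.58, 0.54, 0.59, 0.57, 0.56]
    t, df = one_sample_t(group_60, predicted=0.60)
    print(f"t({df}) = {t:.2f}")
```

A negative t indicates under-matching relative to the predicted value; the statistic is undefined when all Ss in a group have identical means, since the standard deviation is then zero.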


Table 2 presents a summary of the trend analysis of variance
of the means shown in Fig. 1. It can be seen from Table 2 that
there was a significant between-groups probability effect
(p < .005). The general effect was for the probability of A1 to
increase with an increase in E1 events (Fig. 1). The trend analysis
for successive reversals was significant (p < .05). The trend
analysis resulted in a significant linear effect (p < .005)
and a nonsignificant quadratic effect for successive reversals.
This means that the trend of the overall reversal means was
essentially linear and there was no evidence of significant
curvature. The general Probability X Reversals interaction was
nonsignificant, but when broken down into linear and quadratic
components, a significant linear effect was found (p < .005).
This means that the curves differed in linear slope but not in
the amount of curvature. The Linear X Linear and Quadratic X
Linear components of the Probability X Reversals interaction
were both significant at the p < .005 level, indicating that the
curves showed a significant change in slope as the probabilities
increased from .60 to .80, with the .90 and 1.00 groups levelling
off somewhat, resulting in a significant curvature effect.

Table 1.
Means of individual Ss over last 8 reversals
and t of difference from predicted mean.

Subjects    Probability Groups

Table 2.
Summary of the analysis of the trends of A1 responses

Sources of Variance                   Mean Square   Significance Level
Probabilities                         (illegible)   p < .005
Error (1)                             35.12
Reversals                             (illegible)   p < .05
  Linear                              86.82         p < .005
  Quadratic                           (illegible)
Probability x Reversals
  Prob. x Rev. (linear)               (illegible)   p < .005
  Prob. x Rev. (quadratic)            3.83
  Prob. (linear) x Rev. (linear)      56.54         p < .005
  Prob. (quadratic) x Rev. (linear)   288.48        p < .005
Error (2)                             9.31


The significance of differences over the last 8 reversals was
tested with Duncan's New Multiple Range Test (Edwards, 1960,
pp. 136-140) to detect differences between the final asymptotic
response levels of the 5 probability groups (Table 3). The
differences between means are summarized by the underscoring
at the bottom of Table 3. Any means underscored by the same
line do not differ significantly (p < .05). It can be seen
from Table 3 that no differences occurred between the 1.00 and .90
groups or between the .90 and .80 groups. Groups 1.00, .90, and .80 all
differed significantly from groups .70 and .60.

Table 3.
Duncan's New Multiple Range Test on the last 8 reversals.

Probability Groups    .60      .70      .80      .90      1.00
Means                 22.71    26.36    31.96    35.42    36.83

Any two treatment means not underscored by the same
line are significantly different.

Any two treatment means underscored by the same line
are not significantly different.
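
The group means in Table 3 can be related to the matching prediction by converting them to proportions. The sketch below assumes that each mean is a number of A1 responses out of the 40 trials of a reversal, and it reads the garbled entry for the .70 group as 26.36; both points are assumptions, as are the variable and function names.

```python
TRIALS_PER_REVERSAL = 40

# Mean number of A1 responses over the last 8 reversals (Table 3).
# The .70 entry is read as 26.36; that reading is an assumption.
mean_a1_responses = {0.60: 22.71, 0.70: 26.36, 0.80: 31.96, 0.90: 35.42, 1.00: 36.83}


def observed_proportions(means, trials=TRIALS_PER_REVERSAL):
    """Convert mean A1 counts per reversal into response proportions."""
    return {p: count / trials for p, count in means.items()}


if __name__ == "__main__":
    for p, observed in observed_proportions(mean_a1_responses).items():
        print(f"P = {p:.2f}  observed = {observed:.3f}  difference = {observed - p:+.3f}")
```

On this reading every group mean falls at or slightly below its scheduled probability, with the largest shortfalls in the 1.00 and .60 groups, which agrees with the pattern of significant t's reported for those two groups.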


Table 4.
Summary of the Frequency Plot of individual Ss' A1 responses over
the last 10 reversals

Group       Matching    Within 4%      Within 6%      Within 8%
P = 1.00    15%         25%            52%            63%
P = .90     10%         20%            (illegible)    62%
P = .80     15%         38%            47%            57%
P = .70     10%         22%            33%            37%
P = .60     5%          (illegible)    17%            20%

To demonstrate the effect of the different probabilities
on the shape of the learning curve, the mean probabilities over
trials (within days) were plotted for the five groups and are
shown in Figs. 2, 3, and 4 for blocks of reversals 1 to 5,
13 to 17, and 26 to 30, respectively. Trials 1, 2, 3, 4, and 5
are plotted individually and the remaining 35 trials are grouped
in blocks of five trials. In general the final asymptotic
response levels of the five curves tended to resemble matching
as predicted by Estes' statistical learning theory. The 1.00,
.70, and .60 groups consistently under-matched their predicted
asymptotes over the reversal series, while the .90 group tended
to over-match its predicted asymptotic level.


A liberal interpretation of the curves would support Estes'
prediction, but the curves do not support his predictions of response
levels at the beginning of each reversal. In the present study,
statistical learning theory would predict initial response
probability (Po) at the beginning of each reversal to equal 1-Pa
where Pa equals the asymptotic response level of the previous
session. If the asymptotic level equalled .80, then the S when
reversed should manifest a Po = .20. To demonstrate the relationship
of Po to the previous day's final asymptote, the predicted
and observed Po values have been plotted for the five groups and
are presented in graph form in Figs. 5, 6, and 7. In reversals
1 to 5 there is an obvious discrepancy between predicted and
observed Po's for all five groups. In reversals 13 to 17 a large
difference exists in the 1.00, .90, and .80 groups with the .60
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Mean probabilities within days averaged over 
reversals 


Figs. 5, 6, and 7. Mean predicted and observed Po values for the
five probability groups (ordinate: Po; abscissa: probability
groups 1.00, .90, .80, .70, .60).
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group showing a smaller difference and the .70 group showing 

very little difference. The differences in the last block of
reversals tend to approximate the differences reflected in the first
block of reversals, where the smallest difference was about 20 per cent.
Although no appropriate statistical analysis was performed, the data
strongly suggested that the initial response probabilities (Po),
as depicted by the curves, were not congruent with Estes' prediction
of Po. As can be seen in Fig. 6, discrepancies were as large
as 68% for the 1.00 and .90 groups in reversals 13 to 17.
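The prediction tested above can be written out as a short sketch. The function names are hypothetical and chosen only for illustration; the only rule taken from the text is Po = 1 - Pa, where Pa is the asymptote of the previous session.

```python
def predicted_p0(previous_asymptote: float) -> float:
    """Predicted initial response probability after a reversal: Po = 1 - Pa."""
    if not 0.0 <= previous_asymptote <= 1.0:
        raise ValueError("the asymptote must be a probability between 0 and 1")
    return 1.0 - previous_asymptote


def p0_discrepancy(previous_asymptote: float, observed_p0: float) -> float:
    """Observed minus predicted Po; positive values mean Po was higher than predicted."""
    return observed_p0 - predicted_p0(previous_asymptote)


# The example given in the text: an asymptote of .80 predicts Po = .20.
example_prediction = predicted_p0(0.80)
```

A positive discrepancy corresponds to the "high Po values" discussed later in the text.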


Comparison of the mean A1 curves in Fig. 1 with the within
reversal curves of Figs. 2, 3, and 4 prompted an inspection of each
S's A1 response level. To demonstrate what proportion of the Ss
were actually under-matching, matching or over-matching, total A1
responses for each S within the different groups were plotted in
a frequency distribution for each of the last 10 reversal sessions.
The values presented in Table 4 are the percentage of Ss that
manifested matching or that were within 4, 6, and 8 per cent on
either side of the matching asymptote. Examination of Table 4
clearly reveals that the highest percentage of Ss manifesting
matching behavior was only 15 per cent. Furthermore, increasing
the interval to 8 per cent on either side of the matching value
accounted for only 63, 62, 57, 37, and 22 percent of the Ss for the 1.00,
.90, .80, .70, and .60 groups respectively. In terms of mean response
levels the individual Ss were not matching their respective probability
reinforcement schedules, as predicted by statistical learning theory.


These results clearly suggest that the tendency for the curves of
Fig. 1 to match their respective schedules of reinforcement was
an artifact of averaging individual Ss' A1 responses to attain
group means.
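The averaging artifact can be illustrated with a minimal sketch. The scores below are hypothetical, not data from this study, and the tolerance is assumed to be expressed in proportion units (.08 for "within 8 per cent"); the study's exact scoring rule is not given in the text.

```python
def percent_within(proportions, target, tolerance):
    """Percentage of Ss whose A1 proportion lies within `tolerance` of `target`."""
    hits = sum(1 for p in proportions if abs(p - target) <= tolerance + 1e-12)
    return 100.0 * hits / len(proportions)


# Hypothetical A1 response counts out of 40 trials for eight Ss in a .80 group.
a1_counts = [40, 40, 38, 24, 22, 39, 32, 21]
proportions = [count / 40 for count in a1_counts]

# The group mean equals the schedule value even though only one S is near it.
group_mean = sum(proportions) / len(proportions)
exact_matchers = percent_within(proportions, 0.80, 0.0)
within_eight = percent_within(proportions, 0.80, 0.08)
```

In this made-up group the mean is .80 while only one S in eight falls within 8 per cent of the matching value, which is the pattern the frequency plot was designed to expose.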


Error Analysis 


The error analysis was performed in an attempt to determine
a possible mechanism or mechanisms responsible for the Ss' behavior.
It was hoped that the analysis would shed some light on the
discrepancies between predicted and observed Po values. The data
were analyzed to determine if the groups differed with respect to
the number of consecutive unrewarded A2 responses made at the beginning
of each session. According to statistical learning theory the five
groups should differ in their Po values. These should differentially
affect the number of consecutive unrewarded A2 responses at the
beginning of a session. The mean number of consecutive unrewarded
responses made to the A2 bar is presented in Table 5. No
significant t's were found between the five groups. The results
suggested that a common mechanism may have been operating independent
of the probability schedules of reinforcement.


Inspection of the response records suggested that a "win-stay,
lose-shift" strategy or mechanism may have been operating during
the early trials of the reversal sessions. In the following analysis
of the data the term "strategy" is used only to represent the
possible response patterns to the reinforcing events. A win-stay,
lose-shift strategy operates on a recency basis, i.e., the S's
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Table 5 


Mean consecutive unrewarded responses before
shifting to A1

Columns: Probability Groups. Row: unrewarded responses.


Table 6


Mean per cent number of times S used a W-S, L-S
strategy in blocks of ten trials

Rows: the 1.00, .90, .80, .70, and .60 groups, each over
reversals 1 to 5, 13 to 17, and 26 to 30.



behavior is a function of whether reinforcement or nonreinforcement
occurred on only the previous trial. The following analysis
attempted to isolate this effect, although clearly it does not
imply that only this strategy was operating. In the present
experiment there were 8 possible response patterns to the E1 and
E2 events, four of which would be indicative of this strategy.
The four pertinent strategies (marked with asterisks) are illustrated
in Fig. 8. The percentage of trials on which the Ss manifested a
win-stay, lose-shift strategy is presented in Table 6 in blocks
of ten trials for reversal blocks 1 to 5, 13 to 17, and 26 to 30.
From this table it is obvious that the 1.00, .90, and .80 groups
were predominantly using this strategy throughout the entire
probability-reversal training. Although the .70 and .60 groups
were lower than the above three groups, they still consistently
manifested this strategy over 50% of the time. In general, then,
the data seemed to support a win-stay, lose-shift strategy. At
the same time it is also obvious that as the probability of E2
events (1-P) increased, there was a decrease in the number of times
the Ss manifested this response pattern.
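The scoring described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: responses and events are coded 1 or 2 (A1/A2 and E1/E2), a "win" is assumed to mean that the bar pressed was the bar that paid off on that trial, and the function name and the sample sequences are hypothetical.

```python
def score_wsls(responses, events):
    """Percentage of trial-to-trial transitions consistent with win-stay, lose-shift.

    responses[n] is the bar pressed on trial n (1 or 2); events[n] is the bar
    that paid off on trial n (1 or 2). Returns the overall percentage and the
    percentages following E1 and E2 events (None if that event never occurred).
    """
    if len(responses) != len(events):
        raise ValueError("responses and events must have the same length")
    if len(responses) < 2:
        raise ValueError("at least two trials are needed to score a transition")

    counts = {1: [0, 0], 2: [0, 0]}  # event -> [consistent, total]
    for n in range(len(responses) - 1):
        won = responses[n] == events[n]
        stayed = responses[n + 1] == responses[n]
        counts[events[n]][1] += 1
        if won == stayed:  # win-stay or lose-shift
            counts[events[n]][0] += 1

    def pct(consistent, total):
        return 100.0 * consistent / total if total else None

    consistent_all = counts[1][0] + counts[2][0]
    total_all = counts[1][1] + counts[2][1]
    return {
        "overall": pct(consistent_all, total_all),
        "E1": pct(*counts[1]),
        "E2": pct(*counts[2]),
    }


# Hypothetical seven-trial record; the first transition is a lose-stay.
result = score_wsls([2, 2, 1, 1, 1, 2, 1], [1, 1, 1, 1, 2, 1, 1])
```

Scoring E1 and E2 transitions separately mirrors the split analysis reported in Tables 7 and 8.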


To demonstrate this effect the E1 and E2 events of the win-stay,
lose-shift strategy were analyzed separately, and are presented in
Tables 7 and 8 and Figs. 9 and 10. For E1 events (Table 7 and Fig. 9),
it can be seen quite clearly that there is an obvious relationship
between the probability reinforcement schedule and the percentage
of times the Ss manifested a win-stay, lose-shift strategy. The analysis
E1    A1 ------ Stay *
      A1 ------ Shift
      A2 ------ Stay
      A2 ------ Shift *

E2    A1 ------ Stay
      A1 ------ Shift *
      A2 ------ Stay *
      A2 ------ Shift


* Win-stay, Lose-shift


Fig. 8. The 8 possible response alternatives
to E1 and E2 events.
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Table 7. 


Mean percent of times the Ss manifested a W-S, L-S
strategy to E1 events in blocks of ten trials over
reversals 1 to 5, 13 to 17, and 26 to 30.


Reversals 1 to 5    Reversals 13 to 17    Reversals 26 to 30
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Fig. 9. Mean percent of times Ss manifested a W-S, L-S 
strategy to E1 events averaged over all reversals.
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Table 8. 


Mean percent of times the Ss manifested a W-S, L-S
strategy to E2 events in blocks of ten trials over
reversals 1 to 5, 13 to 17, and 26 to 30.
Fig. 10. Mean percent of times Ss manifested a W-S, L-S strategy
to E2 events averaged over all reversals.
of the E2 events did not support this interpretation. In general,
the results of the win-stay, lose-shift analysis indicated that the
Ss "attended" to the E1 events and tended to ignore the E2 events,
while, at the same time, their behavior seemed to be closely related
to their respective probability schedules of reinforcement.


The mean number of times the S shifted from A1 to A2 during
an experimental session was calculated for the 30 reversals.
The results are presented in Table 9. The median test (with Yates'
correction) was conducted to test the differences among the five
groups. The .70 group made significantly more shifts than the
1.00 and .90 groups, while the .60 group was significantly
different only from the 1.00 group (p < .05). The .80 and .90 groups
showed no significant differences in shifts from the 1.00 and .60
groups. In view of the fact that the 1.00 group tended to shift
4.6 times in a session even though there were no programmed shifts,
it would seem reasonable to treat 4.6 as a base or operant rate of
shifting and to subtract it from the rest of the observed values.
The corrected values are also presented in Table 9.
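For readers unfamiliar with the test named above, a two-group median test with Yates' correction can be sketched as below. The shift counts are hypothetical, since the values of Table 9 are not reproduced here, and scores equal to the pooled median are counted as "not above" it, which is one common convention and an assumption on our part.

```python
from statistics import median

CRITICAL_CHI2_05 = 3.841  # chi-square critical value, df = 1, alpha = .05


def median_test_yates(group_a, group_b):
    """Return (pooled median, Yates-corrected chi-square) for two groups."""
    pooled = list(group_a) + list(group_b)
    grand = median(pooled)
    a_above = sum(x > grand for x in group_a)
    b_above = sum(x > grand for x in group_b)
    a_below = len(group_a) - a_above
    b_below = len(group_b) - b_above
    n = len(pooled)
    denominator = (a_above + b_above) * (a_below + b_below) * len(group_a) * len(group_b)
    if denominator == 0:
        raise ValueError("a row or column of the 2 x 2 table is empty")
    # Yates' correction subtracts n/2 from the absolute cross-product difference.
    corrected = max(abs(a_above * b_below - b_above * a_below) - n / 2, 0.0)
    return grand, n * corrected ** 2 / denominator


# Hypothetical mean shifts per S for two groups of eight Ss.
shifts_100 = [3, 4, 5, 4, 6, 5, 4, 5]
shifts_70 = [9, 11, 8, 12, 10, 9, 13, 10]
grand_median, chi2 = median_test_yates(shifts_100, shifts_70)
significant = chi2 > CRITICAL_CHI2_05
```

With these invented scores every S in one group falls above the pooled median and every S in the other falls below it, so the corrected chi-square exceeds the .05 critical value.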


The mean cumulative response latencies within a session
averaged over the 30 reversals for the five groups are presented
in Fig. 11 and Table 10. The data were subjected to a trend analysis
of variance, which is presented in Table 11. No significant differences
were found between reversals. The trend analysis resulted in
significant linear, quadratic, and cubic components (p < .005) and a
nonsignificant quartic component. Inspection of Fig. 11 reveals
that latency was not a simple monotonically increasing function of
a decrease in the probability of E1 events.
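As a check on the trend components, each F ratio reported in Table 11 is the component's mean square divided by the error mean square. The sketch below recomputes them from the tabled mean squares; the variable names are ours.

```python
# Mean squares as reported in Table 11.
MS_ERROR = 260.60
trend_mean_squares = {
    "Linear": 1520.06,
    "Quadratic": 2368.05,
    "Cubic": 2088.60,
    "Quartic": 237.74,
}

# F = MS(component) / MS(error), each component having 1 df against 20 error df.
f_ratios = {name: round(ms / MS_ERROR, 2) for name, ms in trend_mean_squares.items()}
```

The recomputed values agree with the F column of the table for the four trend components.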




Table 9 


Mean number of shifts averaged over all reversals. 
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Table 10 


Mean cumulative latency within sessions 
averaged over all reversals. 
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Table 11 


Trend analysis of latency scores.


Source of Variance      df     Mean Square      F


A. Reversals             5        41.56       1.59


B. Latencies

   Linear                1      1520.06       5.83 *
   Quadratic             1      2368.05       9.09 *
   Cubic                 1      2088.60       8.01 *
   Quartic               1       237.74        .91
   Error                20       260.60

Fig. 11. Mean cumulative latencies within
sessions averaged over all reversals (abscissa: probability
groups 1.00, .90, .80, .70, .60).
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Discussion 
The major findings of this study can be summarized as follows: 


(a) Different probability schedules of reinforcement had
differential effects on the final asymptotic response levels in
successive probability reversals.

(b) The final mean asymptotes of the groups appeared to
match their respective probability schedules of reinforcement.

(c) Individual Ss did not match their respective probability
schedules of reinforcement.

(d) The Po values for each probability group, as predicted
by statistical learning theory, did not coincide with the observed
Po values.

(e) The error analysis revealed:

    (1) That there were no differences between the
        five groups in the number of consecutive
        unrewarded A2 responses made at the
        beginning of each session before shifting
        to the A1 bar;

    (2) That the groups differed in the percentage
        of times they used a win-stay, lose-shift
        strategy during the reversal sessions; and

    (3) That the Ss "attended" more to the E1 events
        than the E2 events.
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The findings of this study in regard to the final mean 
asymptotic response level do not agree with those of Uhl (1963) 
who trained rats in the same apparatus but in a 2-choice probability 
learning situation for 1,000 trials, Edwards (1961) who trained 
human Ss for 1,000 trials, or Wilson (1960) who trained monkey Ss 
for 1,024 trials. In general, the results do not support the 
"maximization" hypothesis of decision-making theory as supported 
by Davidson, Suppes, and Siegel (1957); Edwards (1954); Siegel 
and Goldstein (1959); and Stanley (1950). The final asymptotic
levels of the curves in Fig. 1 tended to partially support Estes'
prediction that the final asymptotic level of A1 responses will
tend to approach the probability of reinforcement for that event
occurring. Examination of individual Ss' A1 response levels clearly
revealed that the Ss were not matching their respective probability
schedules. The results do not support the findings of Estes (1954)
which resulted in individual Ss conforming to the matching prediction
of statistical learning theory. As emphasized by Anderson and Grant
(1957, 1958) and Anderson (1959, 1960, and 1962), data when analyzed
over trials, i.e., averaged response probabilities, can lead to
erroneous conclusions when testing the validity of statistical
learning theory.


In general, the first impression of the curves seemed to indicate
that something like matching was occurring, but comparison of the
observed Po values to the predicted Po values revealed that there
was an obvious discrepancy between the two sets of values. This
discrepancy between the observed and predicted Po values and
the generally poor fit of the curves when plotted over trials,
combined with the evidence that few Ss actually manifested
matching behavior, strongly suggest that statistical learning
theory does not adequately explain the present findings.


Because the initial behavior (Po) at the beginning of each
reversal session did not concur with statistical learning theory
expectations, an attempt was made to identify a possible mechanism
or mechanisms to explain the discrepant results. From statistical
learning theory the five groups would be expected to differ in
the number of consecutive unrewarded A2 responses at the beginning
of a session. No significant differences were found between any of
the five groups. This suggested that a common mechanism was
operating independent of the probability schedules of reinforcement,
at least in the first five trials. This analysis tended to support
the high Po values and consequently contradicted statistical learning
theory in regard to its prediction of initial reversal behavior.


Overall and Brown (1957) pointed out that probability models
require a minimal amount of past experience upon which prediction
can be based; consequently, they deleted predictions made for the
first five responses of each day. Anderson* has also suggested
that one should consider discarding the first five or even 10
trials of each day for the same reason as mentioned by Overall and
Brown. He also suggested that the Ss are sampling from a stimulus
population that consists of more than just the stimuli that were
available for conditioning in the latter part of the previous
reversal. Examples of these stimuli that were not part of the
stimulus population during the entire reversal session are as
follows: (a) internal stimuli (kinesthetic) arising from the
handling of the Ss when being removed from the cage and placed in
the apparatus; (b) slight changes in the smell of the test chamber
due to other Ss; (c) lingering odor of E's hands in the test
chamber; and (d) internal stimuli arising from exploratory behavior.
All of these sources could contribute to "extraneous" stimuli
which were not part of the stimulus population at the end of the
previous reversal session. It also seems reasonable to hypothesize
that these extraneous stimuli would probably predominate in the stimulus
population the Ss were sampling from during the early trials of
each daily session. Assuming that the position preferences of the
groups would have little or no effect on response choices, extraneous
stimuli would tend to result in Po values around .50. The above
may be considered a possible explanation of the observed high Po
values in the present study.


* Personal communication


If the procedure of Anderson and of Overall and Brown in regard to
the first five or ten trials is applied to the present findings,
then the curves in Figs. 2, 3, and 4 would manifest little or no
negatively accelerated growth, and would have extremely divergent
Po values. The forms of the acquisition curves and the asymptotic
response probabilities predicted from statistical learning theory
were not in good agreement with the data. It was hypothesized
that the data may be explainable on a recency basis rather than
by a frequency or probability model. In the present study, for the
Ss to reach criterion in the reversal pretraining, their behavior
would have had to be determined by the most recent events and
not by the past events of the previous reversal (frequency).
The attainment of criterion could be explained by a win-stay,
lose-shift strategy which operates on a recency basis. If this
strategy did in fact exist it seems plausible that it could have
persisted through all of the probability-reversal training.

Although no adequate statistical analysis could be performed
on the data, the results presented in Table 6 strongly supported
the hypothesis that this strategy persisted throughout the
entire training. Goodnow and Pettigrew (1955) have suggested
that whatever pre-experimental response tendencies, sets or strategies
Ss bring to the task may persist throughout the experiment proper.
Anderson (1960), in studying the effect of first-order conditional
probability in a two-choice learning situation, found that high and
low conditional probability sequences produced different acquisition
behavior. More importantly, this difference was maintained at a
high level over several hundred transfer trials on a 50:50 random
sequence common to all conditions. The results of Goodnow and
Pettigrew (1955), Friedman et al. (1960), and Anderson (1960) support
what Anderson has called "repetition responding," that is, predicting
next that event which occurred last. In view of this supporting
evidence it seems reasonable that the initial reversal training
trials produced a set or mode of responding and that this set
persisted throughout the entire probability-reversal training.


The analysis of the E1 and E2 events also demonstrated that
the percentage of time the Ss manifested these strategies was
monotonically related to the probability schedules of reinforcement.
Analyzing the E1 and E2 events separately also showed that the Ss
"attended" more to the E1 events than the E2 events. More
specifically, there seems to be a linear relationship between the
probability schedules and the effect of reinforced and nonreinforced
responses; i.e., as the probability of an E2 event increases, the
effect of a nonreinforced trial in changing behavior (shifting)
decreases and vice versa. The above results are congruent with
the findings of Atkinson (1956), Neimark (1956), and Millward
(1960) on the effect of nonreinforced trials on behavior. Similarly
it has been hypothesized that a reinforced trial has an enhancement
effect if it occurs in a series of nonreinforced trials (Estes and
Burke, 1953).


The results so far discussed indicate that the behavior of the
Ss was to a considerable degree being controlled by a recency or
postremity principle, but at the same time was quite dependent on
the probability of reinforcement. This is not in line with a statistical
learning theory interpretation of the data. The analysis of the
data does, however, strongly suggest that the development of a
win-stay, lose-shift strategy occurred during pretraining and
persisted throughout the 30 reversals. The persistence of this
strategy would in part account for the incongruities of the curves,
such as the high Po values and non-negatively accelerated growth
characteristics.


The efficacy of this strategy for providing information to
the Ss in terms of maximizing reinforcements would decrease with
an increase in the probability of E2 events. In the case of the
1.00 group the utility of this strategy would be optimum; i.e.,
a reinforced trial always follows a reinforced trial and never
follows a nonreinforced trial, whereas in the .60 group a reinforced
trial does not always follow a reinforced trial and can follow a
nonreinforced trial. If the utility of this strategy is considered
in regard to the amount of information that is significant in
acquiring the maximum number of reinforcements, i.e., which bar is
the pay-off bar today, it should be apparent that the amount of
information decreases with an increase in E2 events and that the
probability schedules of reinforcement should interact with this
strategy in the determination of the A1 responses and the final
asymptotic response level.
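The declining utility of the strategy can be illustrated with a small calculation. The sketch assumes, beyond what is stated in this section, that exactly one bar pays off on each trial, that the pay-off bar is drawn independently from trial to trial with probability P for A1, and that the S always responds to the bar that paid off on the previous trial. The function names are hypothetical.

```python
def wsls_reward_rate(p: float) -> float:
    """Long-run proportion of reinforced trials under a strict win-stay,
    lose-shift rule: the S is correct whenever two successive events agree."""
    return p * p + (1.0 - p) * (1.0 - p)


def maximizing_reward_rate(p: float) -> float:
    """Proportion of reinforced trials when the S always presses the
    more frequently reinforced bar."""
    return max(p, 1.0 - p)


# Reward rates for the five schedules used in the study.
schedules = [1.00, 0.90, 0.80, 0.70, 0.60]
wsls_rates = {p: wsls_reward_rate(p) for p in schedules}
```

Under these assumptions the strategy earns reinforcement on every trial in the 1.00 group but only slightly better than chance in the .60 group, which agrees in direction with the argument in the text.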


The present study supports Overall and Brown's (1957) view
that "until frequency and probability theories include a recency
principle, it appears that they will be neglecting an important
consideration, as shown by the growing body of data on the relative
importance of recent events in the learning sequence." An overview
of the findings of this study points out that analysis of sequential
dependencies or specific response patterns, such as a win-stay,
lose-shift strategy, is probably more effective in the description and
analysis of behavior than using the mean asymptotic values of the
learning curves.
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Summary and Conclusions 


To determine the effects of different probability schedules
of reinforcement over a series of successive reversals in a
two-choice learning situation, five groups of rats were run in a
modified two-bar Skinner Box under 1.00, .90, .80, .70, and .60
equal probability schedules of reinforcement.


The different schedules of reinforcement had differential
effects on the terminal response levels. In general, the final
mean asymptotes of the different groups tended to match their
respective probability schedules of reinforcement. The inadequacy
of analyzing grouped data in terms of mean A1 response levels
was demonstrated by the fact that individual Ss did not match
their respective probability schedules of reinforcement.


A statistical learning theory interpretation of the data 
in regard to the Po values was also found to be inadequate. The 
extremely high Po values were somewhat clarified by an error analysis 
of the data. This analysis supported a win-stay, lose-shift strategy 


interpretation of the data. 


In general, the findings of the error analysis suggest that 
the results of the present study may best be interpreted as an 
interaction of a pre-experimentally induced strategy or pattern of 
responding with different probability schedules of reinforcement. 
It was found that this interpretation was congruent with the general 
shape of the curves, whereas the statistical learning theory approach 


was not. 
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It was concluded that an error or sequential response analysis 
of the data would probably be a more fruitful approach for this type
of study, as averaged or mean response data may or may not reflect
the underlying mechanisms, sets, or strategies controlling behavior.
The present study supports the contentions of some researchers
(Anderson, 1960; Anderson and Whalen, 1960; Engler, 1958; Overall
and Brown, 1957; and Witte, 1961). Their view is succinctly
summarized by Witte (1961), "that since behavior is apparently
a function of more remote events, as well as the immediately preceding
event, an analysis of sequence effects is mandatory."
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APPENDIX A 


Individual A1 responses for the 1.00 group
for the 30 probability-reversals 
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APPENDIX A(continued) 


Individual A1 responses for the .90 group
for the 30 probability-reversals 
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APPENDIX A (continued) 


Individual A1 responses for the .80 group
for the 30 probability-reversals 
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APPENDIX A (continued) 


Individual A1 responses for the .70 group
for the 30 probability-reversals 
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APPENDIX A (continued) 


Individual A1 responses for the .60 group
for the 30 probability-reversals 
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